Search | WHO COVID-19 Database

GenBank 2023 update.

Sayers, Eric W; Cavanaugh, Mark; Clark, Karen; Pruitt, Kim D; Sherry, Stephen T; Yankie, Linda; Karsch-Mizrachi, Ilene.

Nucleic Acids Res ; 2022 Nov 09.

Article in English | MEDLINE | ID: covidwho-2235299

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 19.6 trillion base pairs from over 2.9 billion nucleotide sequences for 504 000 formally described species. Daily data exchange with the European Nucleotide Archive (ENA) and the DNA Data Bank of Japan (DDBJ) ensures worldwide coverage. Recent updates include resources for data from the SARS-CoV-2 virus, NCBI Datasets, BLAST ClusteredNR, the Submission Portal, table2asn, a Foreign Contamination Screening tool and BioSample.

Rapid automated validation, annotation and publication of SARS-CoV-2 sequences to GenBank.

Underwood, Beverly A; Yankie, Linda; Nawrocki, Eric P; Palanigobu, Vasuki; Gotvyanskyy, Sergiy; Calhoun, Vincent C; Kornbluh, Michael; Smith, Thomas G; Fleischmann, Lydia; Sinyakov, Denis; Bollin, Colleen J; Karsch-Mizrachi, Ilene.

Database (Oxford) ; 2022, 2022 03 01.

Article in English | MEDLINE | ID: covidwho-1713645

ABSTRACT

Rapid response to the current coronavirus disease 2019 (COVID-19) pandemic requires fast dissemination of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) genomic sequence data in order to align diagnostic tests and vaccines with the natural evolution of the virus as it spreads through the world. To facilitate this, the National Library of Medicine's National Center for Biotechnology Information developed an automated pipeline for the deposition and quick processing of SARS-CoV-2 genome assemblies into GenBank for the user community. The pipeline ensures the collection of contextual information about the virus source, assesses sequence quality and annotates descriptive biological features, such as protein-coding regions and mature peptides. The process promotes standardized nomenclature and creates and publishes fully processed GenBank files within minutes of deposition. The software has processed and published 982 454 annotated SARS-CoV-2 sequences, as of 21 October 2021. This development addresses the needs of the scientific community as the sequencing of SARS-CoV-2 genomes increases and will facilitate unrestricted access to and usability of SARS-CoV-2 genomic sequence data, providing important reagents for scientific and public health activities in response to the COVID-19 pandemic. Database URL https://submit.ncbi.nlm.nih.gov/sarscov2/genbank/.

Subject(s)

COVID-19 , SARS-CoV-2 , COVID-19/epidemiology , COVID-19/genetics , Nucleic Acid Databases , Viral Genome/genetics , Humans , Pandemics , SARS-CoV-2/genetics

Future-proofing and maximizing the utility of metadata: The PHA4GE SARS-CoV-2 contextual data specification package.

Griffiths, Emma J; Timme, Ruth E; Mendes, Catarina Inês; Page, Andrew J; Alikhan, Nabil-Fareed; Fornika, Dan; Maguire, Finlay; Campos, Josefina; Park, Daniel; Olawoye, Idowu B; Oluniyi, Paul E; Anderson, Dominique; Christoffels, Alan; da Silva, Anders Gonçalves; Cameron, Rhiannon; Dooley, Damion; Katz, Lee S; Black, Allison; Karsch-Mizrachi, Ilene; Barrett, Tanya; Johnston, Anjanette; Connor, Thomas R; Nicholls, Samuel M; Witney, Adam A; Tyson, Gregory H; Tausch, Simon H; Raphenya, Amogelang R; Alcock, Brian; Aanensen, David M; Hodcroft, Emma; Hsiao, William W L; Vasconcelos, Ana Tereza R; MacCannell, Duncan R.

Gigascience ; 11, 2022 02 16.

Article in English | MEDLINE | ID: covidwho-1692222

ABSTRACT

BACKGROUND: The Public Health Alliance for Genomic Epidemiology (PHA4GE) (https://pha4ge.org) is a global coalition that is actively working to establish consensus standards, document and share best practices, improve the availability of critical bioinformatics tools and resources, and advocate for greater openness, interoperability, accessibility, and reproducibility in public health microbial bioinformatics. In the face of the current pandemic, PHA4GE has identified a need for a fit-for-purpose, open-source SARS-CoV-2 contextual data standard. RESULTS: As such, we have developed a SARS-CoV-2 contextual data specification package based on harmonizable, publicly available community standards. The specification can be implemented via a collection template, as well as an array of protocols and tools to support both the harmonization and submission of sequence data and contextual information to public biorepositories. CONCLUSIONS: Well-structured, rich contextual data add value, promote reuse, and enable aggregation and integration of disparate datasets. Adoption of the proposed standard and practices will better enable interoperability between datasets and systems, improve the consistency and utility of generated data, and ultimately facilitate novel insights and discoveries in SARS-CoV-2 and COVID-19. The package is now supported by the NCBI's BioSample database.

Subject(s)

COVID-19 , SARS-CoV-2 , Genomics , Humans , Metadata , Public Health , Reproducibility of Results

GenBank.

Sayers, Eric W; Cavanaugh, Mark; Clark, Karen; Pruitt, Kim D; Schoch, Conrad L; Sherry, Stephen T; Karsch-Mizrachi, Ilene.

Nucleic Acids Res ; 50(D1): D161-D164, 2022 01 07.

Article in English | MEDLINE | ID: covidwho-1546007

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 15.3 trillion base pairs from over 2.5 billion nucleotide sequences for 504 000 formally described species. Recent updates include resources for data from the SARS-CoV-2 virus, including a SARS-CoV-2 landing page, NCBI Datasets, NCBI Virus and the Submission Portal. We also discuss upcoming changes to GI identifiers, a new data management interface for BioProject, and advice for providing contextual metadata in submissions.

Subject(s)

Nucleic Acid Databases , Viruses/genetics , Viral Genome , National Library of Medicine (U.S.) , SARS-CoV-2/genetics , United States , User-Computer Interface

GenBank.

Sayers, Eric W; Cavanaugh, Mark; Clark, Karen; Pruitt, Kim D; Schoch, Conrad L; Sherry, Stephen T; Karsch-Mizrachi, Ilene.

Nucleic Acids Res ; 49(D1): D92-D96, 2021 01 08.

Article in English | MEDLINE | ID: covidwho-1387961

ABSTRACT

GenBank® (https://www.ncbi.nlm.nih.gov/genbank/) is a comprehensive, public database that contains 9.9 trillion base pairs from over 2.1 billion nucleotide sequences for 478 000 formally described species. Daily data exchange with the European Nucleotide Archive and the DNA Data Bank of Japan ensures worldwide coverage. Recent updates include new resources for data from the SARS-CoV-2 virus, updates to the NCBI Submission Portal and associated submission wizards for dengue and SARS-CoV-2 viruses, new taxonomy queries for viruses and prokaryotes, and simplified submission processes for EST and GSS sequences.

Subject(s)

Computational Biology/statistics & numerical data , Nucleic Acid Databases , Genomics/methods , SARS-CoV-2/genetics , DNA Sequence Analysis/methods , Animals , COVID-19/epidemiology , COVID-19/virology , Computational Biology/methods , Humans , Information Storage and Retrieval/methods , Internet , Molecular Sequence Annotation/methods , Pandemics

VADR: validation and annotation of virus sequence submissions to GenBank.

Schäffer, Alejandro A; Hatcher, Eneida L; Yankie, Linda; Shonkwiler, Lara; Brister, J Rodney; Karsch-Mizrachi, Ilene; Nawrocki, Eric P.

BMC Bioinformatics ; 21(1): 211, 2020 May 24.

Article in English | MEDLINE | ID: covidwho-687768

ABSTRACT

BACKGROUND: GenBank contains over 3 million viral sequences. The National Center for Biotechnology Information (NCBI) previously made available a tool for validating and annotating influenza virus sequences that is used to check submissions to GenBank. Before this project, there was no analogous tool in use for non-influenza viral sequence submissions. RESULTS: We developed a system called VADR (Viral Annotation DefineR) that validates and annotates viral sequences in GenBank submissions. The annotation system is based on the analysis of the input nucleotide sequence using models built from curated RefSeqs. Hidden Markov models are used to classify sequences by determining the RefSeq they are most similar to, and feature annotation from the RefSeq is mapped based on a nucleotide alignment of the full sequence to a covariance model. Predicted proteins encoded by the sequence are validated with nucleotide-to-protein alignments using BLAST. The system identifies 43 types of "alerts" that (unlike the previous BLAST-based system) provide deterministic and rigorous feedback to researchers who submit sequences with unexpected characteristics. VADR has been integrated into GenBank's submission processing pipeline allowing for viral submissions passing all tests to be accepted and annotated automatically, without the need for any human (GenBank indexer) intervention. Unlike the previous submission-checking system, VADR is freely available (https://github.com/nawrockie/vadr) for local installation and use. VADR has been used for Norovirus submissions since May 2018 and for Dengue virus submissions since January 2019. Since March 2020, VADR has also been used to check SARS-CoV-2 sequence submissions. Other viruses with high numbers of submissions will be added incrementally. CONCLUSION: VADR improves the speed with which non-flu virus submissions to GenBank can be checked and improves the content and quality of the GenBank annotations. 
The availability and portability of the software allow researchers to run the GenBank checks prior to submitting their viral sequences, and thereby gain confidence that their submissions will be accepted immediately without the need to correspond with GenBank staff. Reciprocally, the adoption of VADR frees GenBank staff to spend more time on services other than checking routine viral sequence submissions.

Subject(s)

Betacoronavirus , Coronavirus Infections , Nucleic Acid Databases , Molecular Sequence Annotation , Pandemics , Viral Pneumonia , Software , Betacoronavirus/genetics , COVID-19 , Coronavirus Infections/genetics , DNA Viruses , Genomics , Humans , Molecular Sequence Annotation/standards , Viral Pneumonia/genetics , SARS-CoV-2 , Viruses
